Pattern Matching Algorithms with Don't Cares
Abstract
In this paper, we present algorithms for pattern matching where either the pattern P or the text T can contain “don’t care” characters. If the pattern P contains don’t care characters, we can solve the pattern matching problem in O(n + m + α) time, where α is the total number of occurrences of the component subpatterns. We can also handle online queries: given O(n) preprocessing time, each query requires O(m + α) time. If, on the other hand, the text T contains don’t care characters, we can solve the problem in O(n + m + |occ(P)|) time, where |occ(P)| is the total number of occurrences of P in T. In this case we assume that the length of each component sub-text is greater than the length of the pattern.
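The role of the component subpatterns can be illustrated with a minimal sketch of a counting approach. This is not necessarily the authors' algorithm. It assumes a single-character wildcard `?` and a hypothetical function name `match_with_dont_cares`, and it uses Python's `str.find` as a stand-in for a linear-time multi-pattern matcher, so it does not achieve the O(n + m + α) bound stated in the abstract. The pattern is split into maximal wildcard-free subpatterns, each occurrence of a subpattern votes for the pattern start position it implies, and a position is a match when every subpattern has voted for it.

```python
def match_with_dont_cares(text, pattern, wildcard="?"):
    """Return all start indices where `pattern` matches `text`.

    `wildcard` (an assumption of this sketch) matches any single character.
    """
    n, m = len(text), len(pattern)
    if m > n:
        return []

    # Split the pattern into maximal wildcard-free pieces,
    # remembering each piece's offset inside the pattern.
    pieces = []
    i = 0
    while i < m:
        if pattern[i] == wildcard:
            i += 1
            continue
        j = i
        while j < m and pattern[j] != wildcard:
            j += 1
        pieces.append((i, pattern[i:j]))
        i = j

    # A pattern made only of wildcards matches at every position.
    if not pieces:
        return list(range(n - m + 1))

    # counts[s] = number of pieces that occur at the right place
    # for a pattern occurrence starting at s.
    counts = [0] * (n - m + 1)
    for offset, piece in pieces:
        pos = text.find(piece)
        while pos != -1:
            start = pos - offset
            if 0 <= start <= n - m:
                counts[start] += 1
            pos = text.find(piece, pos + 1)  # allow overlapping occurrences

    return [s for s, c in enumerate(counts) if c == len(pieces)]


print(match_with_dont_cares("abcabdabc", "a?c"))
```

In this sketch, the total number of hits returned by `str.find` across all pieces plays the role of α. Replacing the repeated `str.find` calls with a multi-pattern automaton such as Aho-Corasick is one common way to bring the search phase down to linear time plus the number of occurrences.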
Similar resources
Approximate String Matching with Variable Length Don't Care
Searching for DNA or amino acid sequences similar to a given pattern string is very important in molecular biology. In fact, a lot of programs and algorithms have been developed. Most of them are based on alignment of strings or approximate string matching. However, they do not seem to be adequate in some cases. For example, the DNA pattern TATA (known as TATA box) is a common promoter that oft...
On pattern matching with k mismatches and few don't cares
We consider the problem of pattern matching with k mismatches, where there can be don't care or wild card characters in the pattern. Specifically, given a pattern P of length m and a text T of length n, we want to find all occurrences of P in T that have no more than k mismatches. The pattern can have don't care characters, which match any character. Without don't cares, the best known algorith...
A Filtering Algorithm for k-Mismatch with Don't Cares
We present a filtering-based algorithm for the k-mismatch pattern matching problem with don’t cares. Given a text t of length n and a pattern p of length m with don’t care symbols in either p or t (but not both), and a bound k, our algorithm finds all the places that the pattern matches the text with at most k mismatches. The algorithm is deterministic and runs in Θ(nmk log m) time.
Finding Patterns with Variable Length Gaps or Don't Cares
In this paper we have presented new algorithms to handle the pattern matching problem where the pattern can contain variable length gaps. Given a pattern P with variable length gaps and a text T, our algorithm works in O(n + m + α log(max_{1≤i≤l}(b_i − a_i))) time, where n is the length of the text, m is the summation of the lengths of the component subpatterns, α is the total number of occurrences ...
k-Mismatch with Don't Cares
We give the first non-trivial algorithms for the k-mismatch pattern matching problem with don’t cares. Given a text t of length n and a pattern p of length m with don’t care symbols and a bound k, our algorithms find all the places that the pattern matches the text with at most k mismatches. We first give an O(n(k + log n log log n) log m) time randomised solution which finds the correct answer ...